A Dictionary-Based Compressed Pattern Matching Algorithm
Authors
Abstract
Compressed pattern matching is the problem of finding all occurrences of a pattern in a compressed text without decompressing it. To use bandwidth more effectively on the Internet, it is highly desirable to store and transmit data in compressed form. Because it supports information retrieval over compressed data, compressed pattern matching has been gaining attention from both theoretical and practical viewpoints. In this article, we design and implement a dictionary-based compressed pattern matching algorithm. Our algorithm takes advantage of the dictionary structure common to the LZ78 family. With a slightly modified dictionary structure, we are able to perform 'block decompression' (a key step in many existing compressed pattern matching schemes) and pattern matching on the fly, which improves performance, as our experimental results indicate.
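To make the idea of matching phrase by phrase over an LZ78 dictionary concrete, here is a minimal Python sketch. It is not the algorithm of this article: the abstract does not describe the modified dictionary, so the `parent`/`last` arrays, the function names, and the use of a Knuth-Morris-Pratt matcher are all assumptions chosen for illustration. Each phrase is expanded on its own by walking parent links and fed straight into the matcher, so the full text is never materialized.

```python
from typing import List, Tuple

Code = Tuple[int, str]  # (index of an earlier phrase, next character)

def lz78_compress(text: str) -> List[Code]:
    """Textbook LZ78: emit (longest known phrase, next char) pairs."""
    trie = {}   # (parent phrase index, char) -> phrase index
    out: List[Code] = []
    cur = 0     # phrase 0 is the empty phrase
    for ch in text:
        nxt = trie.get((cur, ch))
        if nxt is not None:
            cur = nxt                     # keep extending the current phrase
        else:
            out.append((cur, ch))         # new phrase = phrase cur + ch
            trie[(cur, ch)] = len(trie) + 1
            cur = 0
    if cur:                               # text ended inside a known phrase
        out.append((cur, ""))
    return out

def kmp_failure(pattern: str) -> List[int]:
    """Standard KMP failure table."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def search_lz78(codes: List[Code], pattern: str) -> List[int]:
    """Return the start offsets of pattern in the text encoded by codes."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    fail = kmp_failure(pattern)
    parent, last = [0], [""]   # assumed dictionary layout: phrase i = phrase parent[i] + last[i]
    hits: List[int] = []
    state = 0                  # KMP state, carried across phrase boundaries
    pos = 0                    # number of text characters consumed so far
    for idx, ch in codes:
        parent.append(idx)
        last.append(ch)
        # Expand only this phrase by walking parent links back to the root.
        block = []
        node = len(parent) - 1
        while node:
            block.append(last[node])
            node = parent[node]
        for c in "".join(reversed(block)):
            while state and c != pattern[state]:
                state = fail[state - 1]
            if c == pattern[state]:
                state += 1
            pos += 1
            if state == len(pattern):
                hits.append(pos - len(pattern))
                state = fail[state - 1]
    return hits
```

Calling `search_lz78(lz78_compress(text), pattern)` returns the same offsets as a direct scan of `text`. The sketch still touches every character of every phrase, so it only shows the streaming structure; published schemes typically precompute per-phrase information to skip that work.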
Similar resources
A Unifying Framework for Compressed Pattern Matching
We introduce a general framework which is suitable to capture the essence of compressed pattern matching according to various dictionary-based compressions. The goal is to find all occurrences of a pattern in a text without decompression, which is one of the most active topics in string matching. Our framework includes such compression methods as the Lempel-Ziv family (LZ77, LZSS, LZ78, LZW), byte-...
A Boyer-Moore Type Algorithm for Compressed Pattern Matching
We apply the Boyer–Moore technique to compressed pattern matching for text strings described in terms of a collage system, which is a formal framework that captures various dictionary-based compression methods. For a subclass of collage systems that contain no truncation, our new algorithm runs in O(‖D‖ + n · m + m² + r) time using O(‖D‖ + m²) space, where ‖D‖ is the size of dictionary D, n is the c...
Multiple Pattern Matching Algorithms on Collage System
Compressed pattern matching is one of the most active topics in string matching. The goal is to find all occurrences of a pattern in a compressed text without decompression. Various algorithms have been proposed depending on underlying compression methods in the last decade. Although some algorithms for multipattern searching on compressed text were also presented very recently, all of them are...
Collage system: a unifying framework for compressed pattern matching
We introduce a general framework which is suitable to capture the essence of compressed pattern matching according to various dictionary-based compressions. It is a formal system to represent a string by a pair of dictionary D and sequence S of phrases in D. The basic operations are concatenation, truncation, and repetition. We also propose a compressed pattern matching algorithm for the framew...
JPEG-LS Based Two-Dimensional Compressed Pattern Matching
With the phenomenal advances in data acquisition techniques via satellites and in medical diagnostics and forensic sciences, we have encountered a massive growth of image data. On account of efficiency (in terms of both space and time), there is a need to keep the data in compressed form for as much as possible, even when it is being searched. The class of images we are concerned in this paper ...